html-escaper
The html-escaper npm package is designed to escape and unescape HTML entities. It is useful for preventing XSS attacks by sanitizing user input or for properly displaying text that includes characters that are reserved in HTML.
Escape HTML
Escapes the HTML special characters &, <, >, ' and " as entities, so that potentially harmful scripts are not executed and HTML tags are displayed as plain text.
const { escape } = require('html-escaper');
const escapedString = escape('<div>Hello & Welcome!</div>');
console.log(escapedString); // &lt;div&gt;Hello &amp; Welcome!&lt;/div&gt;
Unescape HTML
Unescapes HTML entities to convert them back to their original characters, which is useful for rendering text that was previously escaped.
const { unescape } = require('html-escaper');
const unescapedString = unescape('&lt;div&gt;Hello &amp; Welcome!&lt;/div&gt;');
console.log(unescapedString); // <div>Hello & Welcome!</div>
The 'he' package is a robust HTML entity encoder/decoder written in JavaScript. It supports all named character references defined in HTML, handling even obscure and rare entities. It is more comprehensive than html-escaper but might be larger in size.
The 'escape-html' package is a simple and fast utility for escaping HTML entities. It is smaller and more lightweight than html-escaper, but it only provides the escaping functionality and does not include unescaping.
The 'entities' package is another library for encoding and decoding HTML entities. It supports a wide range of character entities and offers both escaping and unescaping functionalities, similar to html-escaper, but with additional options for handling different document types.
A simple module to escape/unescape common problematic entities.
If you'd like to deal with any kind of input, including null or undefined, and even symbol kind, check html-sloppy-escaper out: it's this very same module, except it never throws errors 👍
Version 3 of this module entirely ditches legacy browsers and Node.js versions with broken loaders, such as v13.0.0 and v13.1.0.
As the code is basically identical, simply stick with version 2 if you have any issue with this one 👋
This package is available on npm, so npm install html-escaper is all you need to do, optionally with the global flag too.
Once the module is present
import {escape, unescape} from 'html-escaper';
escape('string');
unescape('escaped string');
There is basically one rule only: never replace one char after another when you are transforming one string into another.
// WARNING: THIS IS WRONG
// if you are that kind of dev that does this
function escape(s) {
  return s.replace(/&/g, "&amp;")
          .replace(/</g, "&lt;")
          .replace(/>/g, "&gt;")
          .replace(/'/g, "&#39;")
          .replace(/"/g, "&quot;");
}
// you might be the same dev that does this too
function unescape(s) {
  return s.replace(/&amp;/g, "&")
          .replace(/&lt;/g, "<")
          .replace(/&gt;/g, ">")
          .replace(/&#39;/g, "'")
          .replace(/&quot;/g, '"');
}
// guess what we have here ?
unescape('&amp;lt;');
// now guess this XSS too ...
unescape('&amp;lt;script&amp;gt;alert("yo")&amp;lt;/script&amp;gt;');
The last example will produce <script>alert("yo")</script> instead of the expected &lt;script&gt;alert("yo")&lt;/script&gt;.
Nothing like this could possibly happen if we grab all chars at once, in both directions.
It is just a fortunate case that, in the chained escape above, no other replace is affected after swapping & with &amp;. Relying on that ordering is not portable and is universally a bad practice.
Grab all chars at once, no excuses!
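To make "grab all chars at once" concrete, here is a minimal sketch of a single-pass escape and unescape. It is an illustration of the idea only, not necessarily how html-escaper is implemented internally; the names ESCAPE_MAP, UNESCAPE_MAP, escapeAll and unescapeAll are hypothetical.

```javascript
// Hypothetical names throughout: a sketch of the technique, not html-escaper's source.
const ESCAPE_MAP = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  "'": '&#39;',
  '"': '&quot;'
};

// Invert the map so both directions share a single source of truth.
const UNESCAPE_MAP = Object.fromEntries(
  Object.entries(ESCAPE_MAP).map(([char, entity]) => [entity, char])
);

// One RegExp per direction, created once and reused.
const CHARS = /[&<>'"]/g;
const ENTITIES = /&(?:amp|lt|gt|#39|quot);/g;

// Each match is replaced in a single scan; replaced text is never re-scanned.
function escapeAll(s) {
  return s.replace(CHARS, (char) => ESCAPE_MAP[char]);
}

function unescapeAll(s) {
  return s.replace(ENTITIES, (entity) => UNESCAPE_MAP[entity]);
}

console.log(escapeAll('<'));          // &lt;
console.log(unescapeAll('&amp;lt;')); // &lt;
```

Because the string is scanned only once, the output of one replacement can never become the input of another, so the order of the entries in the map does not matter.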
more details
Somebody might think this is an unescape issue only, but it is not: being an anti-pattern, its side effects work both ways.
As an example, changing the order of the replacements when escaping would produce the unexpected:
function escape(s) {
  return s.replace(/</g, "&lt;")
          .replace(/>/g, "&gt;")
          .replace(/'/g, "&#39;")
          .replace(/"/g, "&quot;")
          .replace(/&/g, "&amp;");
}
escape('<'); // &amp;lt; instead of &lt;
We do not want to code with the fear that the order wasn't perfect, or that our order in either escaping or unescaping differs from the order another method or function used. If we understand the issue, we can agree it is potentially a disaster-prone approach. On top of that, creating one RegExp object per char each time and invoking .replace once per char through the String.prototype is also potentially slower than creating one function only holding one object, or holding the function too. We should agree there is absolutely no valid reason to keep proposing a char-by-char implementation.
We already have proof that this approach can fail, so why take the risk? Just avoid it and grab all chars at once, or simply use this tiny utility.
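One way to check the order-dependence concern for any escape/unescape pair is a small round-trip test: unescaping an escaped string should always give back the original. The sketch below is only an illustration; findRoundTripFailures, chainedUnescape, singlePassEscape and singlePassUnescape are hypothetical names, and the sample strings are arbitrary.

```javascript
// Hypothetical helper: returns every sample that does not survive escape + unescape.
function findRoundTripFailures(escapeFn, unescapeFn, samples) {
  return samples.filter((sample) => unescapeFn(escapeFn(sample)) !== sample);
}

// The char-by-char unescape from the text: each .replace re-scans earlier output.
function chainedUnescape(s) {
  return s.replace(/&amp;/g, '&')
          .replace(/&lt;/g, '<')
          .replace(/&gt;/g, '>')
          .replace(/&#39;/g, "'")
          .replace(/&quot;/g, '"');
}

// Single-pass versions (illustrative, not html-escaper's source).
const TO_ENTITY = { '&': '&amp;', '<': '&lt;', '>': '&gt;', "'": '&#39;', '"': '&quot;' };
const TO_CHAR = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&#39;': "'", '&quot;': '"' };

function singlePassEscape(s) {
  return s.replace(/[&<>'"]/g, (char) => TO_ENTITY[char]);
}

function singlePassUnescape(s) {
  return s.replace(/&(?:amp|lt|gt|#39|quot);/g, (entity) => TO_CHAR[entity]);
}

const samples = ['<b>', 'Tom & Jerry', '&lt;'];

// The chained unescape decodes '&amp;lt;' twice, so '&lt;' does not round-trip.
console.log(findRoundTripFailures(singlePassEscape, chainedUnescape, samples));    // [ '&lt;' ]
console.log(findRoundTripFailures(singlePassEscape, singlePassUnescape, samples)); // []
```

Input that already looks like an entity is exactly the case the chained version gets wrong, so it is worth including such strings in any sample set.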
Internet Explorer < 9 has a backtick issue
For the sake of compatibility with common server-side HTML entity encoders and decoders, and in order to have the most reliable I/O, this little utility will NOT fix this IE < 9 problem.
It is also important to note that if we create valid HTML and we set attributes at runtime through this utility, backticks in strings cannot possibly affect attribute behaviors.
// browser-only example; html is the module namespace
import * as html from 'html-escaper';

var img = new Image();
img.src = html.escape(
  'x` `<script>alert(1)</script>"` `'
);
// it won't cause problems even in IE < 9
However, if you use innerHTML and you target IE < 9, then this might be a problem.
Accordingly, if you need more chars and/or backticks to be escaped and unescaped, feel free to use alternatives like lodash or he.
Here is a bit more of my POV and why I haven't implemented the same thing the alternatives did. Good news: those are alternatives ;-)
FAQs
fast and safe way to escape and unescape &<>'" chars
The npm package html-escaper receives a total of 18,526,803 weekly downloads. As such, html-escaper popularity was classified as popular.
We found that html-escaper demonstrated an unhealthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.